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Abstract 
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1 Introduction 



The aim of this paper is to illustrate the efficiency of Stochastic Approximation (SA) Theory by 
revisiting several recent results on randomized urn models applied to clinical trials (especially [5, 6, 
7]). We will first retrieve the a.s. convergence (strong consistency) and asymptotic normality results 
obtained in these papers under less stringent assumptions. Then we will take advantage of this 
more synthetic approach to establish a new Central Limit Theorem (CLT) in the more sophisticate 
randomized urn model known as "multi-arm clinical test" . In this model, the urn updating which 
produces the adaptive design is based on statistical estimators of the past efficiency of the assigned 
treatments. 

In these adaptive models, the starting point is the equation which governs the urn composition 
updated after each new treated patient. Basically, we will show that a normalized version of this 
urn composition can be formulated as a classical recursive stochastic algorithm with step 'Yn = ^ 
which classical Stochastic Approximation Theory deals with. Doing so we will be in position to 
establish the a.s. convergence of the procedure by calling upon the so-called Ordinary Differential 
Equation Method {ODE method) and to derive the asymptotic normality - a CLT, to be precise 
- from the standard CLT for stochastic algorithms (sometimes called the Stochastic Differential 
Equation Method {SDE method), see e.g. [14, 9]). These two main theoretical results are recalled 
in a self-contained form in the Appendix. They can be found in all classical textbooks on SA ([9], 
[13], [14], [22]) and go back to [21] and [11]. SA Theory is also used in clinical trials to solve 
dose-finding problems (see for example [12] and citations therein). 

Clinical trials essentially deal with the asymptotic behaviour of the patient allocation to several 
treatments during the procedure. Adaptive designs in clinical trials aim at detecting "on line" which 
treatment should be assigned to more patients, while keeping randomness enough to preserve the 
basis of treatments. This adaptive approach relies on the cumulative information provided by the 
responses to treatments of previous patients in order to adjust treatment allocation to the new 
patients. To this end, many urn models have been suggested in the literature (see [20], [28], [27], 
[15] and [25]). The most widespread random adaptive model is the Generalized Friedman Urn 
{CFU) (see [2] and more recently [19, 24]), also called Generalized Polya Urn [CPU). The idea 
of this modeling is that the urn contains balls of d different types representative of the treatments. 
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All random variables involved in the model are supposed to be defined on the same probability 
space {Q,A,¥). Denote Yq = (yo*)«=iv,(i ^ ^+ \ {0} the initial composition of the urn, where Yq 
denotes the number of balls of type i, i = 1, . . . ,d (of course a more realistic though not mandatory 
assumption would be Yq G N"^ \ {0}). The allocation of the treatments is sequential and the urn 
composition at draw n is denoted by Yn = (y^)i=i,...,d- When the n*'^ patient presents, one draws 
randomly {i.e. uniformly) a ball from the urn with instant replacement. If the ball is of type j, then 
the treatment j is assigned to the n^^ patient, j = 1, . . . , d, n > I. The urn composition is updated 
by taking into account the response of the n*'* patient to the treatment j, or the responses of all 
patients up to the n*'' one (i.e. the efficiency of the assigned treatment), namely by adding Dn balls 
of type i, i = 1, . . . ,d. The procedure is iterated as long as patients present. Consequently the larger 
the number of balls of a given type is, the more efficient the treatment is. The urn composition at 
stage n, modeled by an M'^-valued vector Y^, satisfies the following recursive procedure: 

Yn = Yn-l + DnXn, U > 1, yoGK+\{0}, (1.1) 

with Dn = {Dn)i<ij<d is the addition rule matrix and X„ is the result of the n*'^ draw and 
Xn ■ {0.,A,¥) — {e^,--- ,e'^} models the selected treatment ({e^,--- ,e'^} denotes the canonical 
basis of M'^ and stands for treatment j). We assume that there is no extinction i.e. Yn S \ {0} 
a.s. for every n > 1: so is the case if all the entries Dn are a.s. nonnegative, but other settings can 
also be taken under consideration (see Section 2). We model the drawing in the urn by setting 

d 

Xn = yZl(y,-lyi ) , n>l, (1.2) 



<C/n< 



v^d yt " — v^a 



where {Un)n>i is i-i.d. with distribution Ui ~^o,i]- 



Let J-n = o"(yO) Uk,Dk, 1 < k < n) he the filtration of the procedure. The generating matrices 
are defined as the J>i-compensator of the additions rule sequence i.e. 

Hn = iE [Dii\Tn^,])^^^^^^^,n>l. 

Other fields of application can be considered for such procedures like the adaptive asset allocation 
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by an asset manager or a trader. Indeed this has already been done in [23] and successfully 
implemented with multi-armed bandit procedure. Imagine an asset manager who can trade the 
same financial instrument (tradable asset) on different trading venues. To optimize the execution 
of an inventory of this asset, she can split her orders across these trading destinations. She starts 
with the initial allocation vector Yq. At stage n, she chooses a trading destinations according to the 
distribution (1.2) of X„, then evaluates its performance during one time step and modifies the urn 
composition (most likely virtually) and proceeds. Thus the normalized urn composition represents 
the allocation vector among the venues and the addition rule matrices model the successive re- 
allocations depending on the past performances of the different trading destinations. 

One may also consider this type of procedure as a strategy to update the composition of a 
portfolio or even a whole fund, based on the (recent) past performances of the assets. 

The first designs under consideration were the homogeneous GFU models where the addition 
rules Dn are i.i.d. and the so-called generating matrices Hn = H = ED^ are identical, non- 
random, with nonnegative entries and irreducible. Hence by the Perron-Frobenius Theorem H has 
a unique and positive maximal eigenvalue and an eigenvector with positive components (see [2, 
3, 17, 18]). But the homogeneity of the generating matrix is often not satisfied in practice and 
inhomogeneous GFU models have been introduced (see [5]) in which Hn are not random but 
converge to a deterministic limit H, under the assumption that the total number of balls added 
at each stage is constant. As a third step, the homogeneous Extended Polya Urn (EPU) models 
have been introduced in [26] in which only the mean total number of balls added at each stage is 
constant. This number is called the balance of the urn and the urn is said balanced. 

Finally, in [6] the authors proposed a nonhomogeneous EPU model because in applications, 
the addition rule Dn depends on the past history of previous trials (see [1]), so that the general 
generating matrix Hn is usually random. Thus the entries of H may not be all nonnegative {e.g., 
when there is no replacement after the draw diagonal terms may become negative), and they 
assume that the matrix H has a unique maximal eigenvalue A with associated (right) eigenvector 
V* = (v*'*)j=i^...^rf with Yli=i ~ 1- Furthermore the conditional expectation of the total number 
of balls added at each stage was constant. 

The first theoretical investigations on these models focused on the asymptotic properties of 
the urn composition (consistency and asymptotic normality). However, for practical matter, it 
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is clear that the asymptotic behaviour of the vector Nn ■= Yl^=i -^k which stores the treatment 
allocation among the first n patients is of high interest, especially its variance structure in order to 
compare several adaptive designs. Thus, in [6] is proved the strong consistency of both (normalized) 
quantities l^/n and Nn/n (under a summability assumption on the generating matrices). 

By considering an appropriate recursive procedure for the normalized urn composition derived 
from (1.1) we prove by the ODE method its a.s. convergence toward v* under a significantly less 
stringent assumption, namely the minimal requirement that Hn — — > H. The a.s. convergence of 

n— >oo 

the treatment allocation frequency Nn/n toward the same v* follows from the previous one. 

As concerns asymptotic normality, separate results on these two quantities are obtained in [6] 
under an additional assumption on the rate of convergence of the generating matrices Hn toward H. 
On our side we propose to consider a stochastic approximation procedure with remainder satisfied 
by the higher dimensional vector (Yn/n, Nn/n). Then, the standard CLT for SA procedures with 
remainder directly provides the expected asymptotic normality result for the whole vector under an 
assumption on the L^-rate of convergence of the generating matrices towards their limit (namely 
i.e. \\\Hn — -f^lll = o{n^^/^)) which is again slightly less stringent than the original one. As a result, 
we obtain the asymptotic joint distribution with an explicit global covariance structure matrix. 

In the end of [6], an application to multi-arm clinical trials randomized urn models is proposed. 
This adaptive design has already been introduced in [7] with first consistency results. This kind of 
models is clearly the most interesting for practitioners since it takes into account the past results of 
the assigned treatments in the addition rule matrices, denoted 5„ at time n (5^ denotes the number 
of cured patients by treatment i among the Nn treated ones). The above strong consistency results 
apply but none of the asymptotic normality works as stated since the generating matrices Hn do 
not ~ in fact cannot as we will emphasize - converge at the requested rate. The reason being that 
they themselves satisfy a CLT. However we van overcome this obstacle by increasing once again the 
structural dimension of the problem: we show that the triplet (Yn/n, N^./n, Sn/n) can be written 
as a recursive SA algorithm with remainder satisfying a.s. convergence and a CLT (provided the 
limiting generating matrix is still irreducible, etc). Thus we illustrate on this example that SA 
Theory is a powerful tool to investigate this kind of adaptive design problem. The main difficulty is 
to exhibit the appropriate form for the recursion by making a priori the balance between significant 
asymptotic terms and remainder terms. 
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The paper is organized as follows. We rewrite the dynamics (1.1) of the urn composition as a 
stochastic approximation procedure with state variable for Y^, := in Section 2.1. In Section 2.2 
the a.s. convergence of ^ X]f=i ^ established which implies that of Yn and A'^^ := Nn/n by using 
the ODE method of SA under slightly lighten assumption than in [6]. The rate of convergence is 
investigated in Section 2.3: we obtain a CLT, once again under slightly less stringent assumptions 
on the limit generating matrix H than in [6]. Section 3 is devoted to multi-arm clinical tests. In 
Section 3.1 we briefly recall the Wei GFU model introduced [27, 7] where the generating matrices 
Hn are not random. In this case, the strong consistency and the asymptotic normality follow from 
the results of Section 2 (like in [6]). In Section 3.2 we study the adaptive design proposed in [7] 
where the addition rule matrices depend on the responses of all the past patients. We use the 
results from Section 2.2 to prove the strong consistency. We prove in Section 3.3 a new CLT for 
this model, when the generating matrix satisfies itself a CLT, which relies again on Stochastic 
Approximation techniques. 

Notations Vu = {u^)i=i^...^d £ \\u\\ denotes the canonical Euclidean norm of the column vector 
u on M.'^, w{u) = Ylk=i '^^ denotes its "weight", u* denotes its transpose; |||A||| denotes the operator 
norm of the matrix A G M^^qiM.) with d rows and q columns with respect to canonical Euclidean 
norms. When d = q, Sp(^) denotes the set of eigenvalues of ^. 1 = (1 • • • 1)* denotes the unit 
column vector in W^, 1^ denotes the dx d identity matrix and diag(n) = [6ijUi]i<ij<d^ where 6ij is 
the Kronecker symbol. 

2 Convergence and first rate result 

With the notations and definitions described in the introduction, we then formulate the main 
assumptions to establish the a.s. convergence of the urn composition. 
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(Al) 



(i) Addition rule matrix: For every n > 1, the matrix Dn a.s. has nonnegative 
entries. 

(ii) Generating matrix: For every n > 1, the generating matrices 
Hn = (-f^n )i<ij<d a.s. satisfies 



VjG{l,...,d}, Y.^n=c>0. 



1=1 



[Hi) Starting value: The starting urn composition vector Yq G ^% \ {0}- 

The constant c is known as the balance of the urn. In fact, we may assume without loss of 
generality, up to a renormalization of 1^, that c = 1: since 1^ = ^ and Dn+i = ^"^^^ , n > Q., 
formally satisfies the dynamics (1.1), namely 



Yn = Yn-1 + I?nX„, n > 1, G \ {0}. 

From now on, throughout the paper, we will considered this normalized balance version. Never- 
theless, we will still denote by Yn and the normalized quantities and assume that c = 1. 

(A2) The addition rule D„ is conditionally independent of the drawing procedure given Tn-\ 
and satisfies 



Vl<j<d, supE 

n>l 



< +CXD a.s. 



(2.3) 



where D'n^ = (£>n )i=i,...,d- 

(A3) Assume that there exists an irreducible d x d matrix H (with nonnegative entries) such that 



Hn 



H. 



(2.4) 



H is called the limit generating matrix. 

The combination of assumptions (A1)-(A3) guarantees that H satisfies the assumptions of 
the Perron- Frobenius Theorem (see [10]) so that 1 is the eigenvalue of H with the highest norm 
(maximal eigenvalue) and that the components of its right eigenvector v can be chosen all positive. 
Therefore, we may normalize this vector v* such that w{v*) = 1. 



A VARIANT INCLUDING POSSIBLE DEFINITE REMOVAL. We may relax Assumption (Al) by allow- 
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ing the removal of the drawn ball from its urn (see e.g. [19]). Other relaxation of these requirements 
may be considered: it could be possible to remove other balls than the drawn one. This leads to 
tenable urns (studied notably in [4], see also [24]) where an arithmetical assumption to the row of 
any negative diagonal entry in D„ is added, in order to avoid the urn extinction (see Assumption 
(A'l) below). Thus we may replace Assumption (Al) (after renormalization) by 



Addition rule matrix: For every iG {1, . . . ,d}, there exists Cj G (0, +c«) 
such that, for every n > 1, Vz, j S {1, . . . , d}, + D^^ G — a.s. 

Ci 



(A'l) = 



and \fj G {1, . . . , d}, Zti Dn > a.s. 

(a) Generating matrix: For every n > 1, Hn a.s. satisfies 

d 

VjG {!,... ,4, Y.^n=i- 

1=1 



{Hi) Starting value: The starting urn composition vector Yo£ f ~) \ 



N 



i=l 



In this case H may have negative (diagonal) entries and the Perron- Frobenius Theorem cannot 
be used, so we change Assumption (A3) into 

(A'3) 1 is the maximal eigenvalue of H and 3 u G M"^ \ {0} such that Hv = v. 



Throughout the paper, we may substitute (A'1)-(A'3) for (A1)-(A3) as recalled in each result. 

The following preliminary lemma ensures that if (A'l) holds then the urn extinction never 
occurs and its weight w{Yn) is non-decreasing. 

Lemma 2.1 (Preliminary). // (A'l) holds, then w{Yn) is non- decreasing and postive. 
Proof. We proceed by induction on n > 0. Assume Yn-i G ( H ~) ^'l^^- 

every i G {1, . . . , d}, 

i=\ 



K = K-x + E l{x„=en and {X„ = e^} C {Y^ > 0} = {YU > Vc,}- 



N 

Consequently y„* > y„*_i and G -\{0} on the event Uj^il^n = e^}- On {X„ = e*}, {Y^_^ > j-} 

Ci * 
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so that = y^.i + Z?;^ > ^ - ^ > 0. Finally 

d d 

i=l i=l 

2.1 The dynamics as a stochastic approximation procedure 

Our aim in this section is to reformulate the dynamics (1.1)-(1.2) into a recursive stochastic al- 
gorithm. Then we aim at applying the most powerful tools of SA^ namely the ^^ODE" and the 
^^SDE" methods to elucidate the asymptotic properties {a.s. convergence and weak rate) of both 
the urn composition and the treatment allocation. We start from (1.1) with Yq G M^j. \ {0}. For 
n > 1, 

Yn+l =Yn + Dn+lXn+l = Yn + E [D„,+lXn+l \ Fn] + AAf„+i, (2.5) 

where 

AM„+i := Dn^iXn+l — E [Dn+lXn+l I Fn] 

is an J>i-martingale increment. By the definition of the generating matrix H^, we have 

d d 
E [Dn+lXn+l \Fn] = E [Dn+l\{x„^,=e^}e' | J"™] = ^ E | Fn\ P = 6* | Fn) 

i=l i=l 

y* Y 



so that Yn+l = Yn + Hn+l . " X + AM„+i. 

lf^(>^) 

Now we can derive a stochastic approximation for the normalized urn composition Yn- First we 
have for every n > 1, 

Yn+l Yn I Yn K„ \ AM„+i 



n + 1 n n+1 \ w{Yn) n J n + 1 
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Consequently, Yn = — , n > 1, satisfies a canonical recursive stochastic approximation procedure 
n 

Yn+l = Yn + (Hn+l - h) Yn + ( AM„+i + ( — ^ - 1 ) Hn+lYn 

n + l n+l \ \w{Yn) J 

= Yn--^{h-H)Yn + -^{l^Mn+i+rn+i) (2.6) 
n+l n+l 

with step jn = ^ and a remainder term given by 

rn+l ■■= - A Hn+lYn + (Hn+l " F)y„. (2.7) 

Furthermore, in order to establish the a.s. boundedness of {Yn)n>i we will rely on the following 
recursive equation satisfied by wiYn): 

w{Yn+^) = w{Yn) + ^^^yf"^ + ^(AM„+i). 

By the properties of the generating matrix Hn+i, we obtain 

d d d d / d \ 

w{Hn+lYn) = Y,{Hn+lYn)^ = ^^1^. = E E ^"^1 ^" = 

i=l i=l j=l j=l \i=l / 

Consequently 

wiYn+i) = wiYn) + 1 + w{AMn+i). (2.8) 

2.2 Convergence results 

Theorem 2.1. Let (Yn)n>o be the urn composition sequence defined by (1.1)-(1.2). Under the 
assumptions (Al), (A2) and (A3) (or (A'l), (A2) and {A'^)), 

" n.->oo it;(y„) n->oo 



n n — ' n->oo 
fe=i 

Remarks. • We simply need that Hn — — > H while the assumption in [6] is 



71— >oo 



y "^"~^"- <+oo 

n 

n>l 
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where \\-\\^ is the norm on Lj^^xdW- 

• Assumption (A3) is not necessary to prove that ""^^"^ —>■ 



1. 



Proof. We win first prove that (a) =^ (6), then we wiU prove (a) 
(a) (b). We have 



E [X„ I Tn-l] = Yl 



n-l 



and, by construction ||X„|| = 1 so that E 



I \^ 11^ I TT 



1. Hence the martingale 



fe=l 



and by the Kronecker Lemma we obtain 



1 Yk-l a.s, 



k=l 



0. 



This yields the announced implication owing to the Cesaro Lemma, 
(a) First Step: We have 

d 

Therefore 



SO that E 



\Dr|.|-^X. 



n+l^n+l II \J'n 



d 



1 



{X„+i=eJ}) 



< sup sup E 

n>0 l<j<d 



•F n 



[Xn+l = I Jn) 



n 



< +00 a.s. 



Consequently sup„>;^E ||AAf„+i|| | < +oo a.s.. Therefore thanks to the strong law of large 
numbers for conditionally L^-bounded martingale increments, we have 0. Consequently 

it follows from (2.8) that 



w 



(y„) _ ^ wjYo) - 1 wjMn) 



n 



n 



n n— >cxD 



(2.9) 
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Second Step: Since the components of y„ = ^ are nonnegative and w(Yn) = ^ „ 1, it is 

'.'71 n^oo 

clear that (Yn)n>i is a.s. bounded and that a.s. the set 3^oo of all its limiting value is contained in 

V = w-^{l} = |u G I w{u) = 1} . 

So we may try applying the ODE method (see Appendix Theorem A.l). Since Yn and Hn+iYn are 
a.s. bounded, (2.9) and (A3) imply that r„ 0. 

n— >oo 

The ODE associated to the recursive procedure reads 



ODEi^.H = y = -{Id- H)y. 



Owing to Assumption (A3), I^ — H admits v* as unique zero in V. The restriction of ODEj^^h to 
the affine hyperplane V is the linear system i = — {Id — H)z, where z = y — v* takes values in Vq = 
{n G M"' I w{u) = 0}. Since Sp {{Id - i?) | Vo) C {A G C, 3?e(A) > 0}, owing to Assumption (A3). 
As a consequence v* is an uniformly stable equilibrium for the restriction of ODEj^^h to V, the 
whole hyperplane, as an attracting area. The fundamental result derived from the ODE method 
(see Theorem A.l in Appendix and the notations therein, in particular the remainder r„) yields 
the expected result 

Yn ^ V*. □ 

n— >-oo 

Remark: If we assume that the addition rule matrices {Dn)n>i satisfy besides (Al), then we can 
directly write a stochastic approximation for J^^ ^ with step ^ in which the remainder simply 
reads r„+i = {Hn+i — H) J^y ) ^i^d prove the a.s. convergence under the same assumptions. 

Comments. We could apply directly the ODE method because we first proved that {Yn)n>i is 
a.s. bounded without using the standard Lyapunov machinery developed in SA Theory. That is 
why the assumption on the remainder sequence (r„)„>i simply reads 

r„ ^ 0. 

n— >oo 

Another approach is the martingale one. It relies on the existence of a Lyapunov function V : M'^ — t- 
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M+ associated to the algorithm satisfying 



3a >0, VyGM'^, y^v*, {VV \ la - H) {y) > and (VF | - i/) > a |Vy|^ (2.10) 

In this framework the existence of a Lyapunov function can be estabhshed. Hence, the natural 
condition on the remainder sequence (r„)„>i reads (see [13]) 

11 ||2 

> — - — < +00 a.s. 
^ n 

n>l 

\\\}{ _ //|||2 

In that perspective, the assumption on the generating matrices would read > < +oo 

^-^ n 

n>l 

a.s. which is still slightly less stringent than assumption on the generating matrices made in [6]. 
2.3 Rate of convergence 

In the previous section we proved the a.s. convergence of both quantities of interest, namely Yn and 
Nn, toward v* . In this section we establish a "joint CLT" for the couple On '■= {Yn,NnY with an 
explicit asymptotic joint normal distribution (including covariances) . To this end we will show that 
On satisfies a SA recursive procedure which (a.s. converges toward 0* = {v*,v*Y and) fulfills the 
assumptions of the CLT Theorem A. 2 for SA algorithms (see Appendix), with a special attention 
paid to Condition (A. 22) about the remainder term. 
As concerns Yn, we derive from (2.6) that 

Vn > 1, Yn+l = Yn ^— (id - (2 - wiYn))H) Yn + ^— (AM„+i + rn+i) , 

n+lV / n+1 

_ ( Hn+l - H , {w{Yn) - 1)^ ^\ ~ 

where r„+i := \ H Yn. 



w{Yn) w{Yn 



For Nn we have, still for every n > 1, 



A^„+i = Nn ^— (Nn - (2 - w{Yn))Yn) + ^— (aM„+i + r„+i 

n+lV /n+lV 

Yn ^ (wCYn) - 1)^ ~ 

with AM„+i := Xn+i - E [Xn+i \ = X„,+i and r„+i := Yn 

w{Yn) w{Yn) 
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Thus, we obtain a new recursive SA procedure, still with step 'Jn = ^, namely 



On+i = 0n l-r^(^n) + (AM„+i + , n > 1, 

n + 1 n + 1 



with AM„+i := 



(- \ 

Tn+l 



ye 



I \ 

y 

V7 



, y G M'*, z/ G M*^, h{e) := 



and 



V 



I/ - {2-w{y))y 



with = 0. 



The function h is diffcrcntiable on M^'^ and its differential at point 6* is given by 



Dh{e*) 



h-H + v*l' 0^,( 



v*V - h 



To establish a CLT for the sequence {9n)n>i we need to make the following additional assumptions: 
(A4) The addition rules D„ a.s. satisfy 



VI < j < d, < 



sup„>i E 



I r)-j||2+5 I TT , 
l-'-^n n— 1 



E 



< C < GO for a (5 > 0, 



where C-^ = (Cj7)i<j,«<d, i = 1; • • • > "S?, are d x d positive definite matrices. 



Note that (A4) ^(A2) since E 
(A5) The matrix H satisfies 



n— 1 



2 
2+<5 



nE 0. 



Theorem 2.2. Assume (Al), (A3) (or (A'l), (A'3);, (A4) and (A5). 

(a) Assume furthermore that 

3fie(Sp(i?)\{l})<l/2. 



(2.11) 



(2.12) 
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Then, On — )• 0* a.s. as n ^ +00 and 



^{On-e*) AaA(0,S) with S= r"e"(^'^(^*)-i)re"(^'^(^*)"f)*cit/ 



and r 



k=l 



i.s.-lim E [AM^AM^ | T^- 



(2.13) 



(b) Denote by Amax the eigenvalue, different from 1, of H with the highest real part. If Xmax = 1/2, 
then On — 7- 0* a.s. as n ^ +00 and 



n 



log n 



1 A AA(0,S). 



(c) // Amax > 1/2, i/ien ^„ — t- 0* a.s. as n ^ +00 and (0„ — 0*) a.s. converges as n ^ +00 
towards a finite random variable, where (3 = 1 — Amax • 

Proof, (a) We will check the three assumptions of the CLT for SA algorithms recalled in the 
Appendix (Theorem A. 2). Firstly, the condition (A. 23) on the spectrum of Dh{0*) requested 
for algorithms with step ^ in Theorem A. 2 reads 3f?e (Sp(-D/i(0*))) > ^. This follows from our 
Assumption (2.12) since by decomposing = Mf* © Ker(ti;), one checks that 

Sp(I)/i(r)) = {1} u {1 - A, AG Sp(i/) \ {1}}. 

Secondly Assumption (A4) ensures that Condition (A. 21) is satisfied since 



supE 

n>l 



|AM„f+^ \Tn-l 



< +00 a.s. and E rAM„AM^ | Tn-i] ^ T as n 00, 



where T is the symmetric nonnegative matrix given by (2.13) as established below. To this end we 
have to determine three blocks since T reads 



/r r ^ 

i 1 J- 12 

r2 



where ri,r2,ri2 G Mc 
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Computation of Ti . 



9=1 



n— ^-oo 



9=1 



Computation ofT2- 



E 



diag 



^ n I ^ n \ a.. 



W 



Yn \ Yn / Yfi \ a.s. _ j- / *\ */ *\t 
TTTT 7T7T " 7T7T " ^ ^2 = diag V - W V 



Computation ofTu. 



E 



AM„+iAAf*+i I 



— E [Dn+lXn+lX^^_^^i I J^n] — E [Dn+lXn+1 \ J^n] E [Xn+1 \ J^n] 
= E [Dn+l I Tn] E I 

-E [Dn+l I Tn] E [Xn+1 I Tn] E [X^+l \ Tnf 

= F„+idiag ( — ^ I - Hn+i-^- ( — ^ ) 
^ r^2 = H{dmg(vn-v*{v*Y). 



Finally, it remains to check that the remainder sequence {Rn)n>i satisfies (A. 22) for an e > 0: 



E 



(n + 1) ll-Rn+ill^ l{\\e„-e*\\<e} 



0. 



(2.14) 



We note that = ||r„+i||^ + ||r„+i||^. It follows from the definition of r^+i and the elementary 
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facts \\Yn -v*\\< WOn - 9*\\ and w{Yn) > ||y„|| that 



< 2 (-(y-^)%'''^-Y-^iin^Kiii, 

\^ Ihlli M J 2 [\\9n-e*\\<^^j 



< 6{iwiYn)-l)^ + \\\Hn+l-H\\\^)l 



L{|i,„_,.|i<^}- 



But w{Yn) - 1 = ^^^i^^ where sup„>oE \w{AMn+i)\^^^ \ J^n <C',6> 0, owing to (A4). Now 
using that \w{y)\ < Cd\\y\\, 



E 



n 



wiYn) - 1 



{\\en-e*\\<^} 



< C?nE 



w(Yn) - 1 



2+5' 



a. 



n 



1+5 



E 



\2+5 



< 



1+5 ■■ 



n 



where C? > is a real constant. Consequently 



E 



w{Yn) - 1 



-{l|e„-e*||<^} 



o I - 

n 



Thus, by (A5) we obtain 



E 



kn+l|| 1 



{\\en-e*\\<i^} 



n 



The same argument yields E 



{l|e„-e-||<lli^} 



n 



, therefore (2.14) is satisfied. 



(b)-(c) Follows from Theorem A. 2 (6)-(c) in the Appendix (see also [14]). 



□ 



3 Application to urn models for multi-arm clinical trials 

In this section, we consider urn models for multi-arm clinical trials introduced by Wei and general- 
ized by Bai, Hu and Shen. In this context, the initial framework where the addition rule matrices 
have nonnegative entries is the only one to make sense. 

3.1 The Wei GFfJ Model 

We consider here the model presented in [27] and in [7], where balls are added depending on 
the success probabilities of each treatment. Define an efficiency indicator as follows: let (T^)„>i, 
1 < i < d, he d independent sequences of [0, l]-valued i.i.d. random variables, independent of the 
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i.i.d. sampling sequence {Un)n>i so that 

K[Tl]=p\ < < 1, 1 < i < d (3.15) 

Remark. If (r^)„>i, 1 < i < d, is simply a success indicator, namely d independent sequences 
of i.i.d. {0, l}-valued Bernoulli trials with respective parameter p*, then the convention is to set 
T^ = l to indicate that the response of the i^^ treatment in the n*'^ trial is a success and = 
otherwise. 



In this framework one considers the filtration J^n = c (Yq, Uk,Tk,l < k < n), n > 0. Consider 
the following addition rules: a success on the treatment i adds a ball of type i to the urn and a 
failure on the treatment i adds ^y^y balls for each of the other d — 1 types. Thus the addition rule 
proposed in [27] is as follows 



1 — 



-'- -'n + l 

d-1 



7^2 



1-TJ 



1_T1 1 — T2 

1 ^ -'n+l -'n + l 



n+l \ 



d-\ 



1 — T^* 

-'n + l 



SO that 



-ffn+l — IE [-Dn+1 I 



ED, 



n+l 



if 



/ 1 



q 

d-l 



\jiL jt_ 

\d-l d-l 



d-l 



q 

d-l 



pd J 



where = 1 — , 1 < i < d. The strong consistency has been first established in [3], then redone 
in [6]. It follows from Theorem 2.1 as well. The asymptotic normality 



nv 



\ n 



AA(0,S) 
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results from Theorem 3.2 in [6] and from Theorem 2.2 of this paper. However using Theorem 2.2 
we obtain a joint CLT for (1^, N^)- Furthermore we know that 

^« = Vg' l<i<d. 

Note that if > p-', then > u*-^. Hence the components v*^ are ordered according to the 
increasing efficiency of the treatments. Furthermore, it is clear that, if f 1 and all other 
probabilities -{P stand still, then 

lim v*-^ = 5ij. 

Consequently, since v** is the asymptotic probability of assigning treatment z to a patient, the 
procedure asymptotically allocates more patients to the most efficient treatment (s). Following the 
practitioners, the fact that a marginal allocation of less efficient treatments is preserved is justified 
by some comparison matter. 

However this model only takes into account in the addition rule matrix D„ the response of the 
n*'^ patient without considering the ones of past patients. This led the author to introduce [7] a 
new model based on statistical observations of the efficiency of the assigned treatments to all past 
patients. 

3.2 The Bai-Hu-Shen GFU Model 

We consider now the model introduced in [7] (and considered again in [6]) where (T^)„>i,l < i < d, 
are d independent sequences of i.i.d. {0, l}-valued Bernoulli trials satisfying (3.15) and the filtration 
{J^n)n>o is defined as in the previous section. Let iV„ = (A^^, . . . , N^f and Sn = {S^, • • • , 5^)*, where 
= N^_i + X^, n > 1, still denotes the number of times the i*'' treatment is selected among the 
first n stages and 

denotes the number of successes of the i*'* treatment among these N^^ trials, i = 1, . . . , d. However, 
to avoid degeneracy of the procedure, we will make the following initialization assumption 

A^^ = 1,5^ = 1, i = l,...,d 
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(which makes the above mterpretation of these quantities correct "up to one unit"). 

Remark. Like with the Wei model, we can simply assume that is a {0, l}-valued efficiency 
indicator. 



Define II^ = (11^, . . . , H^)*, where = i = I, . . . ,d. In [7] the authors consider the 
following addition rule matrices, 



n+l 



n+l 



i.e. at stage n + l, if the response of the j'*'* treatment is a success, then one ball of type j is added 
in the urn. Otherwise, — ^^^p- (virtual) balls of type i, i ^ j, are added. This addition rule matrix 
clearly satisfies (Al)-(i) and (A2). Then, one easily checks that the generating matrices are given 
by 



/ 1 n^(i-p^) 



n^(i-p'') \ 



n^(i-p') n^(i-p') 



and satisfy (A1)-(m). As soon as e M+ \ {0}, (see Lemma 3.1 below or [7] when 

Yo £ (0, oo)"') where 



H 



( pi 



y:,^2p> 



p 



pHi-p'') \ 
^j^dP' 



^j^dp' 



pd(l_pl) pd(^_p2) 
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The matrix H is clearly irreducible since < < 1, 1 < i < d so that Assumption (A3) is 
satisfied. Then calling upon Theorem 2.1 (or following the direct proof from [7]) we obtain 



^ Yn a.s. * J i\r * I'o i c^ 

y„ = — — > V and Nn = — > v . (3.16) 

11 n— >oo fi n— ^oo 

Note that the normalizes maximal eigenvector v* (associated to the eigenvalue 1) is given by 



EpJ k ' 

l<j<d ^k^j P 



l,...,d. 



Note that if p* > p^, Sl^i^iiL > i and ^— ^ > 1 so that v*^ > v*K Hence the entries u** are 
ordered according to the increasing efficiency of the treatments. This model can be considered 
as more ethical than the Wei model since a better treatment will be administrated to more patients. 
Indeed, when d > 2, for any i ^ j, 1 < i, j < d, ii > , 

^BHS y ^ y I 
'^BHS '"W 

(when d = 2 both matrices H coincide). 

Remark. Note that in that model the "balls" in the urn become virtual since there exists no 
iV G N such that, for every n > 1, NDn € A^rf(N). 



3.3 Asymptotic normality for multi-arm clinical trials for the BHS GFU model 

In [7] in order to derive a CLT, not with the bias El^ but with nv*, from their own general 
asymptotic normality result (which statement is similar to Theorem 2.2) they need to fulfill the 
following convergence rate assumption for Hn 

n>l ^ 

where \\-\\^ is the norm on L^^xdi^)- In [7], the a.s. rate of decay — iJ|||oo = o(n~4) is shown 
which is clearly not fast enough to fulfill (3.17). 
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However, by enlarging the dimension of the structure process of the procedure by considering 
the 3d-dimensional random sequence 



n 

\SnJ 



where Sn = — , n > 1, 
n 



we will establish that a CLT does hold for the BHS GFU model. 

The first step is to notice that the generating matrix H^+i can may be written as a function de- 
pending on Sn and Nn, i.e. Hn+i = $(S'„, Nn), where $ : x (0, oo)'' Md{^) is a differentiable 
function defined by 



$(s,i/) = ($'^(s,zy))^^._^.^^ where < 



$»»(s, v) = 1 <i < d 



Then the following strong consistency and CLT hold for {6n)n>i. 

Theorem 3.1. Assume that e \ {0}. (a) //sfte(Sp(i?) \ {!}) < ^, then 



9n ^ e* and ^ (On -6*) N f 0, , 



where 



e*:={v*,v*,dmg{p)v*f, E= f^"" e<^W)-h)Ve<''~^(^*)-iy du 

Jo 



with 



/ d \ 
^v*''C'' -v*{v*Y H{diag{v*)-v*{v*Y) (diag^) - diag(p) 

k=l 



(diag(u*) — v*{v*yy diag(t;*) — v*{v*y (diag(i;*) — v*{v*y) diag(p) 



ydiag(^») (diag(w*) - v*{v*yy diag(p) (diag(u*) - diag(p) [v* - v*'u**diag(p)) j 
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where = {Cij)i<i,j<d, I < k < d, are d x d positive definite matrices with 



and 



Dh{e*) 



Id-H + v*l^ 



v*l'-h 



\diag{p) {v*l'-h) 



0^. 



Id 



which is invertihle. 



(b) Denote by Amax the eigenvalue, different from 1, of H with the highest real part. //Amax = 1/2, 
then, On —7- 0* a.s. as n ^ +oo and 



n 



log n n->oo 



AA(0,E). 



(c) // Amax > 1/2, i/ien — 9*) a.s. converges as n ^ +oo towards a finite random variable, 

where 13 = 1- Amax- 



Proof. Step 1 (Strong consistency). We will show with Lemma 3.1 that Sn — — > diag{p)v* 



and we will deduce that Hn —4 H, i.e. Assumption (A3) holds. As we have already checked 

n— >oo 

that Assumptions (Al)-(i)-(ii) and (A2) are satisied, then by only adding (Al)-(izi) we use 
Theorem 2.1 to prove that 9n 9* ■ 



Lemma 3.1. // the assumption (1.1) holds and Yq G 1^+ \ {0}, then, 

II„ — >p=(p ,...,p ) asn^oo 
so that Assumption (2.4) holds i.e. Hn — — > H. 

n— ^oo 

Remark. If we assume that 1^* > 0, 1 < i < d, then we can prove that limnA'^^ = +oo a.s., 
1 < i < d, faster than below by using that Yn > Yq, 1 < i < d, n > 1. The following proof 
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considers the more general case where G \ {0}. 

Proof of Lemma 3.1. Step 1. It follows from the dynamics (1.1) and the definitions of Dn+i and 
Hn+i that, for every n > 0, wiYn) = ^^(^o) + and that, for every {1, . . . , d}, 

where (AM^)„>i is a sequence of martingale increments satisfying sup„E [|AM^p | -^n-i] < +oo 
since the addition rule matrices satisfy (2.3). Now using that Sq = Nq = 1 by convention, one 
derives that 

Vi ^ j, Hi^^, > ^, with Ko = ^ mm {p\ I - p') > Q 
so that, using that i^^^i ~ -P*' there exists a deterministic integer no such that for every n > uq, 



n w{Yn)'' " n 
2w{Yn)J " n 



Standard computations show that, setting d!^ = YYk^noi^ ~^ 2w{Y ) ) > ^ = 1; • • • ; 



Vn>no, -r>-^+ 2^ -r+ L 



A: 



Since there exists ki, K2 > such that Kin 2 < aj^ < ^2^, 2 , one has 

^ AMI / 
Vr/>0, > ^^ = o(n 2 ). 
a. 

fe=no+l 

Finally, there exists a positive real constant c' such that, for every i = 1, . . . , d, 



r „ > c n 2 > A; 2 + o[n 2 j 

A;=?io+l 
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so that 

VzG{1,...,4, liminfy^ > c' / ti^^du > 
and, as a consequence, 5^„>i = +00 a.s. Now using that for every i = \, . . . ,d, 

^n = Y.Hx,=e^} and W{Xr, = e'\Fn-i) = n-i{^-:^^^, n>l, 

we get by the conditional Borel-Cantelh Lemma that = hm„ A/^ = +00 a.s. 
Step 2. First we note that 

n 

and we introduce the sequence (n„)„>i defined by 

~ " A AT* 

k=l ^^k-l + ^ 

It is an J>i-martingale since, being independent of J^k-i a^id X^, 

e{{ti - p')ANi I Tk-i) = nn - p')nxk = e' \ j^^-i) = o. 

It has bounded increments since \T^ — < 1 and 

~, - Ei{ANir\:F,.,) 

It fohows, using {ANlf = ANl, that, for every n > 1, 

ANi \ ,„/A ANi 



r < ^ = 1. 



Consequently 11^ — ^ 11^ G L^(P) a.s. as n — )■ 00. This in turn implies by Kronecker's Lemma that 

„ — > p as n ^ 00 

since AT^ ^ 00 by the first step. □ 
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It follows from the lemma and Theorem 2.1 that {Yn,Nn) — )• {v*,v*). Furthermore diag(S'„) = 
diag(Q„)A'^„ — )• diag{p)v* = u* so that 0„ — )• 61* as n — )• +00. 

Step 2 (Asymptotic normality). We will show now that (^n)n>i satisfies an appropriate recursion 
to apply Theorem A. 2 (CLT). First, we write a recursive procedure for Sn- Having in mind that 

Sn = l+E 



+1 



Sn 
Sn 



1 



n + 1 
1 

n + 1 
1 

n + 1 



Sn - diag(T„+i)X„+i 

Yn 



Sn - diag(p) 



wiYn 



+ 



1 



n + 1 



-AM 



n+1 



1 



Sn - diag(p)(2 - w{Yn))Yn ) + ( AM„+i + f„+i ) (3.18) 



where AM„+i := diag(r„+i)X„+i - E [diag(r„+i)X„+i | = diag(r„+i)X„+i - diag(p) 



Y 

J- n. 



w{Yn 



is an J>;-martingale increment and r„,+i = diag(p) ^"'^'^"2 ^'^ Yn- Then we rewrite the dynamics 



satisfied by y„ as follows 



Yn+l =Yn \- (id - (2 - w{Yn))Hn+l) Yn + (AM„+i + f„+i) , (3.19) 

n+lV / n+1 



(w{Yn) - 1 

where fn+i '•= —Hn+i^n- Finally, we get the following recursive procedure for 6n 

w{Yn) 



yn+l = f7n - ^ h{On) H 7— (aM„+i + Rn+1 ) , n > 1, 



n + 1 



n + 1 



where, for every 6 = (y, s)* G 



+ ' 



h{e) :-- 



^{h-{2-w{y))^{s,v))y^ 
v-{2- w{y))y 
s - (2 - w{y))d\a.g{p)y J 



, AM„+i 



^ AM„+l^ 

AM„+i 
yAMn+i J 



and Rn+i := 



^n+l 
rn+1 
\rn+l j 



Let us check that the addition rule matrices satisfy (A4). For every j G {1, . . . let set Cn 
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E 



We have that 



E 



QlrQ 



I 



1 



{i=i'=j} 



because T^_^^{1 — T^_^_i) = 0. Then owing to Lemma 3.1, Ci with 



T^{i,i'^j}+P'^{i=i'=j}- 



We can check that is a positive definite matrice. Consequently (A4) holds. 
The function $ being differentiable at the equihbrium point 9*, we have 



Dh{e*) 



Id-H + v*P 



v*r - Id 



=e* a. 



\diag{p) {v*l^ - Id) ^Mm Id J 

which is invertible since by Schur complement we have det{Dh{0*)) = det(/rf — H + thanks 
to I; ($(s,z^)y)|g^^. = -diag(p)^ 

At this stage, the proof follows the lines of that of Theorem 2.2: the computation of the 
covariance matrix T and the treatment of the remainder term uses the same tools as before. The 
three results of convergence rate follows from Theorem A. 2 in the Appendix. The details are left 
to the reader. □ 

Remark. The asymptotic variances of Yn and Nn in Theorem 3.1 are different from those in 
Theorem 2.2 because the differential matrices Dh{6*) and Dh{9*) are not the same. 

Corollary 3.1. Under the assumptions of Theorem 3.1, 
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where Th is a (P x (f matrix given by Th = D^{u* ,v*)[T.i^d,j+d]i<i,j<2dD^{u* ,v*Y . 
Proof. This is an easy consequence of the so-called A-method since 



Hn = HSn, Nn) = Hu*, V*) + D^u*,v*).{Sn - u* , - V*) + || (5„ - u* , - v*)\\e{Sn, Nn) 
with limj^_^(-„. = 0. Consequently 

V^{Hn -H)= D^u\v*).{./^(Sn - U*),V^iNn - V*)) + 

where ep(n) goes to in probability (as the product of a tight sequence and an a.s. convergent 
sequence). This concludes the proof. □ 

Remark. This corollary shows a posteriori that it was hopeless to try applying Theorem 2.2 in its 
standard form to establish asymptotic normality for multi-arm clinical trials since the assumption 
(A5) cannot be satisfied. Our global SA approach breaks the vicious circle. 
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Numerical Example: BHS model. We consider the case d = 2, so v* as the same form as in 
the example in Subsection 2.3. 
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Figm'e 1: Convergence of — toward v* (up-windows) and of — toward v* (down-windows): d = 2, 
n = 2.103, pi = 0.5, p2 = 07, Yq = (0.5,0.5)* and Nq = (1, 1) " 



Appendix 



A Basic tools of Stochastic Approximation 

Consider the following recursive procedure defined on a filtered probability space {VL,A, {J^n)n>Q-,^ 



'in>nQ, 9n+i = 0n- 7„+i/i(6'„) + 7„+i (AM„+i + r„+i) , 



(A.20) 
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where /i : M — J- R is a locally Lipschitz continuous function, an J>ig -measurable finite random 
vector and, for every n > no, AMn+i is an J>i-martingale increment and is an J>^-adapted 
remainder term. 

Theorem A.l. (A.s. convergence with ODE method, see e.g. [9, I4, 22, 16, 8]). Assume that h 
is locally Lipschitz, that 



fn and sup E 

n>no 



< +00 U.S., 



and that (7n)n>i is a positive sequence satisfying 



^ 7n = +00 and ^ 7^ < +00. 



E 

n>l n>l 



Then the set Q°° of its limiting values as n ^ +00 is a.s. a compact connected set, stable by the 
flow of 

ODEh = 9 = -h{e). 
Furthermore if 9* G 0°° is a uniformly stable equilibrium on G°° of ODE^, then 



Comments. By uniformly stable we mean that 

sup 1 61(6*0, t) - 61*1 — ^0 as t +00, 
where 6'(6'o, t)e(,ge°°,tGK+ is the flow of ODEh on Q°° . 

Theorem A. 2. (Rate of convergence see [I4] Theorem 3.III.14 p. 131 (for CLT see also e.g. [9, 
22])). Let 9* be an equilibrium point of {h = 0}. Assume that the function h is differentiable at 9* 
and all the eigenvalues of Dh{9*) have positive real parts. Assume that for some 5 > 0, 



sup E 

n>no 



< +00 U.S., E rAM„+iAM* 1 I J"J ^ r, (A.21 
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where T is a deterministic symmetric definite positive matrix and for an e > 0, 



E 



(n + 1) ||rn+if l{||g^_g*||<,} 



(A.22) 



Specify the gain parameter sequence as follows 



Vra > 1, 7„ 



n 



(A.23) 



(a) If A := 3fie(Aniin) > ^, where Xmm denotes the eigenvalue of Dh{9*) with the lowest real part, 
then, the above a.s. convergence is ruled on the convergence set {9n — > 0*} by the following Central 
Limit Theorem 



n{9n 



^ Q' 2A- 1 ^ ' ^ 



du. 



(6) //A = i then 



log n n-)-oo 



(c) // A < then {6n — 9*) a.s. converges as n ^ +oo towards a finite random variable. 
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